This repository contains the scripts, tutorials, and templates for the Rich Lab’s mainstream bioinformatic workflows. The current recommended tutorial versions of these workflows are linked below.

Tutorial Options

MetadataSetup.html

  • The purpose of a metadata file is to organize the potential independent (predictor) variables for your analysis into a single table with one row per SampleID. When you later produce a set of potential dependent (outcome) values, you organize those into a similar structure, one row per SampleID, which streamlines matching predictor variables to outcome variables by SampleID.

  • It’s good practice to keep as much of your information as possible in one tidy table so that you can keep pulling from that single source to filter, wrangle, and analyze without losing track of different versions and datasets over time.

  • That means you should brainstorm as many potential predictor variables for downstream analysis as you can and organize them into one tidy table where each SampleID is matched to a value for every variable. You will end up ignoring most of these variables as you construct individual tests and visuals later, so consider this simply a rough draft of your information of interest.

  • This tutorial uses the Pygmy Loris dataset as an example. You may have a very different set of variables to organize for your own project.
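The matching step described above can be sketched in base R. This is a toy example — the SampleIDs and variable names here are hypothetical stand-ins for whatever your own project uses:

```r
# Toy metadata: one row per SampleID, one column per predictor variable
metadata <- data.frame(
  SampleID = c("HDZ001", "HDZ002", "HDZ003"),
  subject  = c("culi", "warble", "culi"),
  diet     = c("baseline", "trial1", "trial1")
)

# Toy outcomes: one row per SampleID, one column per outcome variable
outcomes <- data.frame(
  SampleID = c("HDZ001", "HDZ002", "HDZ003"),
  shannon  = c(2.31, 1.98, 2.75)
)

# Because both tables share the one-row-per-SampleID structure,
# matching predictors to outcomes is a single join
analysis <- merge(metadata, outcomes, by = "SampleID")
```

Every downstream test or visual can then subset columns from `analysis` instead of juggling separate files.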

Files Needed

  1. compilation_loris.tsv - produced by the SampleInventory.Rmd script
  2. All other files depend on your specific variables of interest
  • Mine mostly focus on diet trials and environmental context for the Pygmy Loris subjects

Files Produced

  1. samples_metadata.tsv - This file is carried forward into several other scripts in this repository to help you build consistent predictor variables across datasets.

microbiome_tidy_alignments.html

  • Follow this tutorial to take new raw read alignment data produced by EPI2ME Labs’ wf-16s pipeline and convert it into fully formatted, tidy tables that you can normalize and analyze using microeco or a similar package.

Files Needed

  1. samples_metadata.tsv - produced by the MetadataSetup.Rmd script
  2. *_abundance_table_species.tsv - produced by the wf-16s pipeline
  3. *_wf-16s-report.html - produced by the wf-16s pipeline
  • The tutorial provides instructions and code to download one .csv file per demultiplexed sample from this .html file for a given sequencing run and to merge those files into a single table of raw alignment data.
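The merge step can be sketched in base R. The files here are toy stand-ins written to a temp directory so the example is self-contained; in practice you would point `list.files()` at the folder of per-sample downloads:

```r
# Toy stand-ins for the per-sample CSVs downloaded from the wf-16s report
dir.create(dl <- file.path(tempdir(), "downloads"), showWarnings = FALSE)
write.csv(data.frame(taxon = c("A", "B"), count = c(10, 5)),
          file.path(dl, "barcode01.csv"), row.names = FALSE)
write.csv(data.frame(taxon = c("A", "C"), count = c(7, 2)),
          file.path(dl, "barcode02.csv"), row.names = FALSE)

# Read every CSV, tag its rows with the sample name taken from the
# file name, and stack them into one long table of raw alignments
csv_files  <- list.files(dl, pattern = "\\.csv$", full.names = TRUE)
alignments <- do.call(rbind, lapply(csv_files, function(f) {
  df <- read.csv(f)
  df$SampleID <- sub("\\.csv$", "", basename(f))
  df
}))
```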

Files Produced

  1. taxonomy_table.tsv - This table is required for microeco and several other metagenomic packages. It must contain exactly one row per taxon in your dataset.
  2. otu_table.tsv - This table is required for microeco and several other metagenomic packages. It must contain exactly one row per taxon and one column per sample in your dataset. The numeric values represent the raw count of reads mapped to a given taxon within a given sample. We will want to normalize these before analyzing/interpreting them.
  3. sample_table.tsv - This table is required for microeco and several other metagenomic packages. It must contain exactly one row per sample in your dataset. The columns contain metadata values that we may use as independent/predictor variables for interpreting our results.
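The structural requirements above (one row per taxon, one column per sample, one row per sample) are easy to verify with a few assertions once the tables are built. A sketch with toy in-memory tables — the real ones come from the tutorial’s output files:

```r
# Toy versions of the three tables microeco expects
taxonomy_table <- data.frame(row.names = c("taxA", "taxB"),
                             Kingdom = c("Bacteria", "Bacteria"),
                             Genus   = c("Lactobacillus", "Bifidobacterium"))
otu_table      <- data.frame(row.names = c("taxA", "taxB"),
                             HDZ001 = c(120L, 30L),   # raw read counts
                             HDZ002 = c(0L, 95L))
sample_table   <- data.frame(row.names = c("HDZ001", "HDZ002"),
                             subject = c("culi", "warble"))

# The three tables must agree: taxa as rows, samples as columns
stopifnot(identical(rownames(otu_table), rownames(taxonomy_table)))
stopifnot(identical(colnames(otu_table), rownames(sample_table)))
```

Running checks like these before building a microeco object catches mismatched IDs early, when they are still easy to trace.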

microbiome_references.html

  • Follow this tutorial to take the NCBI accessions from the read alignments you created in the previous script, fetch one representative sequence per OTU in your dataset, and then build a matching phylogenetic tree.

Files Needed

  1. alignments.tsv - raw alignment data gathered from the wf-16s output and organized into one table in the previous script

Files Produced

  1. representative_seqs.fasta - This contains one GenBank reference sequence for each taxon in your dataset, which we can substitute for the ASVs used in many approaches designed for short-read 16S data.
  2. representative_seqs_tax4fun.fasta - This is the same as the previous file, except with different syntax to work with the Tax4Fun2 package.
  3. tree.newick - This is a phylogenetic tree produced from a multiple sequence alignment of the previous FASTA file.
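A quick way to sanity-check these outputs is to confirm that the tree tips match the representative sequence names one-to-one. This sketch assumes the ape package is installed (any Newick-capable parser works) and uses toy stand-ins for the two files:

```r
library(ape)  # assumed installed; provides read.tree() and read.FASTA()

# Toy stand-ins for tree.newick and representative_seqs.fasta
tree  <- read.tree(text = "((taxA,taxB),taxC);")
fasta <- tempfile(fileext = ".fasta")
writeLines(c(">taxA", "ACGT", ">taxB", "ACGA", ">taxC", "ACGG"), fasta)
seqs  <- names(read.FASTA(fasta))

# Every tip in the tree should correspond to exactly one sequence
stopifnot(setequal(tree$tip.label, seqs))
```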

General Script Advice

On Your First Use of a Workflow

I created a secondary repository called workflows_first_use. This is where I will store some scripts/tutorials with specific instructions for setting up your working directory and environment for specialized packages (especially for microeco).

params option in the R Markdown Header

Here is what my default yaml header on most R Markdown documents in this repository looks like:

---
output:
  html_document:
    theme:
      bslib: true
    css: journal.css
    toc: true
    toc_float: true
    df_print: paged
params:
  sampleset: "loris"
  seqrun: "hdz18"
---

I use the params option to streamline some automation across different scripts. You can use the sampleset setting under params in the header of this script to select which sampleset you will be working with. Anywhere a chunk of code contains params$sampleset, that placeholder will be replaced with whichever string you set as the params value in the header.
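For example, with sampleset set to "loris" in the header, any chunk can build sampleset-specific file names from the same value (the `params` list below is a stand-in for what knitr provides automatically when knitting):

```r
# Stand-in for the params list knitr builds from the yaml header
params <- list(sampleset = "loris", seqrun = "hdz18")

# Build a sampleset-specific file name from the header value
infile <- paste0("compilation_", params$sampleset, ".tsv")
```

Switching the header to a different sampleset re-points every such path without editing any chunks.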

File Paths and the config Package

I also use the config package for streamlining and organization. If you want to do the same, create your own config.yml file in your project working directory. You can expand the content below to see my example config.yml file as well as the R script I use to integrate it with the params values for consistent file sourcing and writing across datasets.

default:
  setup: "setup/global/setup.R"
  conflicts: "setup/conflicted.R"
  functions: "setup/global/functions.R"
  packages: "setup/global/packages.R"
  inputs: "setup/default_inputs.R"
  fonts: "setup/global/fonts/FA6_Free-Solid-900.otf"
  test_prediction: "setup/microbiome/test_prediction/"
  visuals: "visuals/"
  tmp_tsv: "tmp/tmp_table.tsv"
  tmp_downloads: "tmp/downloads/"
  tmp_fetch: "tmp/fetch_references.txt"
  tmp_fasta3: "tmp/tmp3.fasta"
  tmp_fasta4: "tmp/tmp4.fasta"
  git_url: "https://rich-molecular-health-lab.github.io/"
  
isolates:
  samplesets: "salci" 
  minQC: 10
  
microbiome:
  micro_scripts: !expr list("setup/microbiome/functions.R", "setup/microbiome/inputs.R")
  setup_dir: "setup/microbiome"
  inputs: "setup/microbiome/inputs.R"
  packages: "setup/microbiome/packages.R"
  first_use_packages: "setup/microbiome/packages_first_use.R"
  functions: "setup/microbiome/functions.R"
  tax4fun: "setup/microbiome/Tax4Fun2_ReferenceData_v2/"
  Ref99NR: "setup/microbiome/Tax4Fun2_ReferenceData_v2/Ref99NR/"
  blast: "setup/microbiome/ncbi-blast-2.16.0+/bin"
  
swan:
  functions: "setup/read_processing/functions.R"
  loris_tax: "/work/richlab/aliciarich/bioinformatics_stats/data/loris/taxonomy/"
  uid_file: "/work/richlab/aliciarich/bioinformatics_stats/tmp/fetch_references.txt"
  logs: "/work/richlab/aliciarich/bioinformatics_stats/logs"
  ont_reads: "/work/richlab/aliciarich/ont_reads/"
  dorado_model: "/work/richlab/aliciarich/ont_reads/dna_r10.4.1_e8.2_400bps_sup@v5.0.0/"
  bioinformatics_stats: "/work/richlab/aliciarich/bioinformatics_stats"
  samplesheets: "/work/richlab/aliciarich/bioinformatics_stats/dataframes/sample_sheet/loris"
  scripts: "/work/richlab/aliciarich/bioinformatics_stats/batch_scripts/"
  accessions: "/work/richlab/aliciarich/bioinformatics_stats/tmp/accessions.txt"
  tmp_fasta1: "/work/richlab/aliciarich/bioinformatics_stats/tmp/tmp1.fasta"
  tmp_fasta2: "/work/richlab/aliciarich/bioinformatics_stats/tmp/tmp2.fasta"
  tmp_fasta3: "/work/richlab/aliciarich/bioinformatics_stats/tmp/tmp3.fasta"
  tmp_fasta4: "/work/richlab/aliciarich/bioinformatics_stats/tmp/tmp4.fasta"
  raw_loris_mb: "/work/richlab/aliciarich/ont_reads/loris_microbiome/hdz_raw"
  basecalled_loris_mb: "/work/richlab/aliciarich/ont_reads/loris_microbiome/basecalled"
  trimmed_loris_mb: "/work/richlab/aliciarich/ont_reads/loris_microbiome/trimmed"
  filtered_loris_mb: "/work/richlab/aliciarich/ont_reads/loris_microbiome/filtered"
  loris_mb_aligned: "/work/richlab/aliciarich/bioinformatics_stats/data/loris/taxonomy/refseqs_aligned.fasta"
  loris_mb_tree: "/work/richlab/aliciarich/bioinformatics_stats/data/loris/taxonomy/refseqs_tree.newick"

bats:
  sequencing:
    coverage: "visuals/bats_16s_depth_summary.html"
    depth_plot: "visuals/bats_16s_depth_hist.html"
  metadata:
    summary: "metadata/bats/samples_metadata.tsv"
    key: "metadata/bats/metadata_key.R"
    factors: "metadata/bats/factors.R"
    scripts: !expr list("metadata/bats/colors.R", "metadata/bats/metadata_key.R", "metadata/bats/factors.R")
  inventories:
    all_stages: "../read_processing/samples/inventories/bats/compilation_bats.tsv"
    collection: "../read_processing/samples/inventories/bats/samples_bats.csv"
    extraction: "../read_processing/samples/inventories/bats/extracts_bats.csv"
    libraries: "../read_processing/samples/inventories/bats/libraries_bats.csv"
    seqruns: "../read_processing/samples/inventories/bats/seqruns_bats.csv"
  barcodes_output: "../read_processing/samples/barcode_alignments/bats/"
  read_alignments: "data/bats/read_alignments"
  taxa_reps:
    aligned: "data/bats/taxonomy/refseqs_aligned.fasta"
    tree: "data/bats/taxonomy/refseqs_tree.newick"
    table: "data/bats/taxonomy/tax_table.tsv"

loris:
  day1: "2023-10-26"
  last: "2024-10-25"
  visuals:
    demographics: "visuals/loris/demography"
    coverage: "visuals/loris_depth_summary.html"
    depth_plot: "visuals/loris_depth_hist.html"
  AZAstudbooks:
    btp: "metadata/loris/Studbook/BTP25.R"
    btp_current: "metadata/loris/Studbook/BTP25.tsv"
    lifetable: "metadata/loris/Studbook/lifetable25.tsv"
    lifetabStatic: "metadata/loris/Studbook/lifetable25_static.tsv"
    load_data: "metadata/loris/Studbook/LoadStudbookData.R"
    functions: "metadata/loris/studbookFunctions.R"
    reactables: "metadata/loris/Studbook/reactables.R"
    living25: "metadata/loris/Studbook/LivingPop_2025.csv"
    living21: "metadata/loris/Studbook/LivingPop_2021.csv"
    historic21: "metadata/loris/Studbook/HistoricPop_2021.csv"
    institutions: 
      current25: "metadata/loris/Studbook/CurrentInstitutions_2025.csv"
      current21: "metadata/loris/Studbook/CurrentInstitutions_2021.csv"
      historic21: "metadata/loris/Studbook/HistoricInstitutions_2021.csv"
    working: "metadata/loris/Studbook/working_studbook_loris25.tsv"
    reactable_ready: "metadata/loris/Studbook/studbook_loris_reactableReady.tsv"
    timeline: "metadata/loris/Studbook/working_timeline_loris25.tsv"
  metadata: 
    scripts: !expr list("metadata/loris/colors.R", "metadata/loris/metadata_key.R", "metadata/loris/diet_schedule.R")
    bristol: "metadata/loris/bristols.tsv"
    studbook: "metadata/loris/subjects_loris.csv"
    summary: "metadata/loris/samples_metadata.tsv"
    key: "metadata/loris/metadata_key.R"
    factors: "metadata/loris/factors.R"
    foods: "metadata/loris/foods.tsv"
    proteins: "metadata/loris/proteins.tsv"
    fats: "metadata/loris/fats.tsv"
    CHOs: "metadata/loris/CHOs.tsv"
    Ash: "metadata/loris/Ash.tsv"
    vitamins: "metadata/loris/vitamins.tsv"
    reactable: "metadata/loris/loris_metadata_summary.html"
    sample_table: 
      identifiers: "metadata/loris/identifier_key.tsv"
      main: "metadata/loris/sample_table.tsv"
      merged: "metadata/loris/sample_table_merged.tsv"
  inventories:
    all_stages: "../read_processing/samples/inventories/loris/compilation_loris.tsv"
    collection: "../read_processing/samples/inventories/loris/samples_loris.csv"
    extraction: "../read_processing/samples/inventories/loris/extracts_loris.csv"
    libraries: "../read_processing/samples/inventories/loris/libraries_loris.csv"
    seqruns: "../read_processing/samples/inventories/loris/seqruns_loris.csv"
  outputs_wf16s: "data/loris/outputs_wf16s/"
  barcodes_output: "dataframes/barcodes/loris/"
  read_alignments: "data/loris/read_alignments"
  taxa_reps:
    aligned: "data/loris/taxonomy/refseqs_aligned.fasta"
    tree: "data/loris/taxonomy/refseqs_tree.newick"
    table: "data/loris/taxonomy/tax_table.tsv"
  abundance_wf16s: "data/loris/wf16s_abundance/"
  microeco: 
    dataset:
      main:
        keg: "microeco/loris/datasets/main/keg"
        njc: "microeco/loris/datasets/main/njc"
        fpt: "microeco/loris/datasets/main/fpt"
        tax: "microeco/loris/datasets/main"
      culi:
        keg: "microeco/loris/datasets/culi/keg"
        njc: "microeco/loris/datasets/culi/njc"
        fpt: "microeco/loris/datasets/culi/fpt"
        tax: "microeco/loris/datasets/culi"
      warb:
        keg: "microeco/loris/datasets/warble/keg"
        njc: "microeco/loris/datasets/warble/njc"
        fpt: "microeco/loris/datasets/warble/fpt"
        tax: "microeco/loris/datasets/warble"
    abund:
      main:
        keg: "microeco/loris/abundance/main/keg"
        fpt: "microeco/loris/abundance/main/fpt"
        njc: "microeco/loris/abundance/main/njc"
        tax: "microeco/loris/abundance/main"
      culi:
        keg: "microeco/loris/abundance/culi/keg"
        fpt: "microeco/loris/abundance/culi/fpt"
        njc: "microeco/loris/abundance/culi/njc"
        tax: "microeco/loris/abundance/culi"
      warb:
        keg: "microeco/loris/abundance/warble/keg"
        fpt: "microeco/loris/abundance/warble/fpt"
        njc: "microeco/loris/abundance/warble/njc"
        tax: "microeco/loris/abundance/warble"
    alpha:
      main: "microeco/loris/alphadiversity/main"
      culi: "microeco/loris/alphadiversity/culi"
      warb: "microeco/loris/alphadiversity/warble"
    beta:
      main:
        kegg: "microeco/loris/betadiversity/main/keg"
        fpt: "microeco/loris/betadiversity/main/fpt"
        njc: "microeco/loris/betadiversity/main/njc"
        tax: "microeco/loris/betadiversity/main"
      culi:
        kegg: "microeco/loris/betadiversity/culi/keg"
        fpt:  "microeco/loris/betadiversity/culi/fpt"
        njc:  "microeco/loris/betadiversity/culi/njc"
        tax: "microeco/loris/betadiversity/culi"
      warb:
        kegg: "microeco/loris/betadiversity/warble/keg"
        fpt:  "microeco/loris/betadiversity/warble/fpt"
        njc:  "microeco/loris/betadiversity/warble/njc"
        tax: "microeco/loris/betadiversity/warble"
    data:
      main:
        feature: "microeco/loris/datasets/main/feature_table.tsv"
        tree:    "microeco/loris/datasets/main/phylo_tree.tre"
        fasta:   "microeco/loris/datasets/main/rep_fasta.fasta"
        samples: "microeco/loris/datasets/main/sample_table.tsv"
        taxa:    "microeco/loris/datasets/main/tax_table.tsv"
      culi: 
        feature: "microeco/loris/datasets/culi/feature_table.tsv"
        tree:    "microeco/loris/datasets/culi/phylo_tree.tre"
        fasta:   "microeco/loris/datasets/culi/rep_fasta.fasta"
        samples: "microeco/loris/datasets/culi/sample_table.tsv"
        taxa:    "microeco/loris/datasets/culi/tax_table.tsv"
      warb:
        feature: "microeco/loris/datasets/warb/feature_table.tsv"
        tree:    "microeco/loris/datasets/warb/phylo_tree.tre"
        fasta:   "microeco/loris/datasets/warb/rep_fasta.fasta"
        samples: "microeco/loris/datasets/warb/sample_table.tsv"
        taxa:    "microeco/loris/datasets/warb/tax_table.tsv"

sample_sheets:
  compilations:
    bats:    "../read_processing/samples/sample_sheets/bats/nwr_combined_sample_sheet.csv"
    loris:    "../read_processing/samples/sample_sheets/loris/hdz_combined_sample_sheet.csv"
  nwr1: "../read_processing/samples/sample_sheets/bats/nwr1_sample_sheet.csv"
  hdz1:  "../read_processing/samples/sample_sheets/loris/hdz1_sample_sheet.csv"
  hdz2:  "../read_processing/samples/sample_sheets/loris/hdz2_sample_sheet.csv"
  hdz3:  "../read_processing/samples/sample_sheets/loris/hdz3_sample_sheet.csv"
  hdz4:  "../read_processing/samples/sample_sheets/loris/hdz4_sample_sheet.csv"
  hdz5:  "../read_processing/samples/sample_sheets/loris/hdz5_sample_sheet.csv"
  hdz6:  "../read_processing/samples/sample_sheets/loris/hdz6_sample_sheet.csv"
  hdz7:  "../read_processing/samples/sample_sheets/loris/hdz7_sample_sheet.csv"
  hdz8:  "../read_processing/samples/sample_sheets/loris/hdz8_sample_sheet.csv"
  hdz9:  "../read_processing/samples/sample_sheets/loris/hdz9_sample_sheet.csv"
  hdz10: "../read_processing/samples/sample_sheets/loris/hdz10_sample_sheet.csv"
  hdz11: "../read_processing/samples/sample_sheets/loris/hdz11_sample_sheet.csv"
  hdz12: "../read_processing/samples/sample_sheets/loris/hdz12_sample_sheet.csv"
  hdz13: "../read_processing/samples/sample_sheets/loris/hdz13_sample_sheet.csv"
  hdz14: "../read_processing/samples/sample_sheets/loris/hdz14_sample_sheet.csv"
  hdz15: "../read_processing/samples/sample_sheets/loris/hdz15_sample_sheet.csv"
  hdz16: "../read_processing/samples/sample_sheets/loris/hdz16_sample_sheet.csv"
  hdz17: "../read_processing/samples/sample_sheets/loris/hdz17_sample_sheet.csv"
  hdz18: "../read_processing/samples/sample_sheets/loris/hdz18_sample_sheet.csv"

barcode_alignments:
  compilations:
    loris:    "../read_processing/samples/barcode_alignments/loris/hdz_combined_barcode_alignment.tsv"
    bats:    "../read_processing/samples/barcode_alignments/bats/nwr_combined_barcode_alignment.tsv"
  nwr1: "../read_processing/samples/barcode_alignments/bats/nwr1_barcode_alignment.tsv"
  hdz1:  "../read_processing/samples/barcode_alignments/loris/hdz1_barcode_alignment.tsv"
  hdz2:  "../read_processing/samples/barcode_alignments/loris/hdz2_barcode_alignment.tsv"
  hdz3:  "../read_processing/samples/barcode_alignments/loris/hdz3_barcode_alignment.tsv"
  hdz4:  "../read_processing/samples/barcode_alignments/loris/hdz4_barcode_alignment.tsv"
  hdz5:  "../read_processing/samples/barcode_alignments/loris/hdz5_barcode_alignment.tsv"
  hdz6:  "../read_processing/samples/barcode_alignments/loris/hdz6_barcode_alignment.tsv"
  hdz7:  "../read_processing/samples/barcode_alignments/loris/hdz7_barcode_alignment.tsv"
  hdz8:  "../read_processing/samples/barcode_alignments/loris/hdz8_barcode_alignment.tsv"
  hdz9:  "../read_processing/samples/barcode_alignments/loris/hdz9_barcode_alignment.tsv"
  hdz10: "../read_processing/samples/barcode_alignments/loris/hdz10_barcode_alignment.tsv"
  hdz11: "../read_processing/samples/barcode_alignments/loris/hdz11_barcode_alignment.tsv"
  hdz12: "../read_processing/samples/barcode_alignments/loris/hdz12_barcode_alignment.tsv"
  hdz13: "../read_processing/samples/barcode_alignments/loris/hdz13_barcode_alignment.tsv"
  hdz14: "../read_processing/samples/barcode_alignments/loris/hdz14_barcode_alignment.tsv"
  hdz15: "../read_processing/samples/barcode_alignments/loris/hdz15_barcode_alignment.tsv"
  hdz16: "../read_processing/samples/barcode_alignments/loris/hdz16_barcode_alignment.tsv"
  hdz17: "../read_processing/samples/barcode_alignments/loris/hdz17_barcode_alignment.tsv"
  hdz18: "../read_processing/samples/barcode_alignments/loris/hdz18_barcode_alignment.tsv"

abund_wf16s_files:
  hdz1:  "data/loris/wf16s_abundance/hdz1_abundance_table_species.tsv"
  hdz2:  "data/loris/wf16s_abundance/hdz2_abundance_table_species.tsv"
  hdz3:  "data/loris/wf16s_abundance/hdz3_abundance_table_species.tsv"
  hdz4:  "data/loris/wf16s_abundance/hdz4_abundance_table_species.tsv"
  hdz5:  "data/loris/wf16s_abundance/hdz5_abundance_table_species.tsv"
  hdz6:  "data/loris/wf16s_abundance/hdz6_abundance_table_species.tsv"
  hdz7:  "data/loris/wf16s_abundance/hdz7_abundance_table_species.tsv"
  hdz8:  "data/loris/wf16s_abundance/hdz8_abundance_table_species.tsv"
  hdz9:  "data/loris/wf16s_abundance/hdz9_abundance_table_species.tsv"
  hdz10: "data/loris/wf16s_abundance/hdz10_abundance_table_species.tsv"
  hdz11: "data/loris/wf16s_abundance/hdz11_abundance_table_species.tsv"
  hdz12: "data/loris/wf16s_abundance/hdz12_abundance_table_species.tsv"
  hdz13: "data/loris/wf16s_abundance/hdz13_abundance_table_species.tsv"
  hdz14: "data/loris/wf16s_abundance/hdz14_abundance_table_species.tsv"
  hdz15: "data/loris/wf16s_abundance/hdz15_abundance_table_species.tsv"
  hdz16: "data/loris/wf16s_abundance/hdz16_abundance_table_species.tsv"
  hdz17: "data/loris/wf16s_abundance/hdz17_abundance_table_species.tsv"
  hdz18: "data/loris/wf16s_abundance/hdz18_abundance_table_species.tsv"

methods_16s:
  libprep_workflow: "'rapid16s'"
  dorado_model: "'dna_r10.4.1_e8.2_400bps_sup@v5.0.0'"
  min_length: 1000
  max_length: 2000
  min_qual: 7
  min_id: 85
  min_cov: 80
  kit_name: "'SQK-16S114-24'"
  tax_rank: "S"
  n_taxa_barplot: 12
  abund_threshold: 0
  loris:
    rarefy: 3500
    norm: "SRS"
    min_abund: 0.00001
    min_freq: 0.01
    include_lowest: TRUE
    unifrac: TRUE
    betadiv: "aitchison"
    alpha_pd: TRUE
    tax4fun_db: "Ref99NR"
    loris_rarefy: 3500
    keg_minID: 97
    
global            <- config::get(config = "default")
swan              <- config::get(config = "swan")
micro             <- config::get(config = "microbiome")
loris             <- config::get(config = "loris")
marmoset          <- config::get(config = "marmoset")
isolates          <- config::get(config = "isolates")
bats              <- config::get(config = "bats")
methods_16s       <- config::get(config = "methods_16s")
sample_sheets     <- config::get(config = "sample_sheets")
abund_wf16s_files <- config::get(config = "abund_wf16s_files")
barcode_alignments<- config::get(config = "barcode_alignments")

seqruns      <- seqruns %>% keep_at(params$sampleset)       %>% list_flatten(name_spec = "")
subject_list <- keep_at(subjects, paste0(params$sampleset)) %>% list_flatten(name_spec = "{inner}")
path         <- config::get(config = paste0(params$sampleset))
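Taken together, the params value selects which section of config.yml becomes `path`, so the same chunk code resolves to different files per sampleset. A self-contained sketch (assuming the config package is installed; the file written here is a toy stand-in for the config.yml above):

```r
# Minimal stand-in for the repository's config.yml
cfg <- tempfile(fileext = ".yml")
writeLines(c(
  "default:",
  "  visuals: \"visuals/\"",
  "loris:",
  "  metadata:",
  "    summary: \"metadata/loris/samples_metadata.tsv\""
), cfg)

# config::get() merges the named section over "default"
path <- config::get(config = "loris", file = cfg)

path$metadata$summary  # sampleset-specific path
path$visuals           # inherited from the default section
```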

Custom Engines

I use custom language engines in some scripts, which I named “terminal” and “bash”. If you see a chunk with {terminal, warning = FALSE} where you would usually see {r} at the top, then running the chunk only prints that code as a text string in the document. This makes it easier for me to copy and paste the code directly into the terminal panel of my RStudio window when running code through a remote server instead of my local R console. There are ways to set up RStudio to run code through multiple servers, but I find this the simplest way to switch back and forth while still keeping a record of the code I have used and any changes I have made to it.

You can expand the content below to see the code that I run at the start of many of these scripts to initiate those language engines.

# Requires purrr (map, list_flatten) and the magrittr pipe, loaded elsewhere in setup
knitr::knit_engines$set(terminal = function(options) {
  code <- paste(options$code, collapse = "\n")
  
  params <- map(params, ~ if (is.atomic(.)) list(.) else .) %>%
    list_flatten()
  
  patterns <- list(
    params             = list(
      sampleset    = paste0(params$sampleset),
      seqrun       = paste0(params$seqrun),
      samplesheet  = as.character(sample_sheets[paste0(tolower(params$seqrun))])
    )            ,
    
    global            = global            ,
    path              = path              ,
    swan              = swan              ,
    micro             = micro             ,
    loris             = loris             ,
    isolates          = isolates          ,
    bats              = bats              ,
    methods_16s       = methods_16s       ,
    sample_sheets     = sample_sheets     ,
    abund_wf16s_files = abund_wf16s_files ,
    barcode_alignments= barcode_alignments
  )
  
  
  # Replace placeholders group by group
  for (group in names(patterns)) {
    placeholder_list <- patterns[[group]]
    for (name in names(placeholder_list)) {
      placeholder <- paste(group, name, sep = "\\$") # regex for e.g. "params$sampleset"
      value <- placeholder_list[[name]]
      
      # Substitute the configured value for every occurrence of the placeholder
      code <- gsub(placeholder, value, code, perl = TRUE)
    }
  }
  
  options$warning <- FALSE
  knitr::engine_output(options, code, out = code)
})


knitr::opts_chunk$set(message = FALSE,
               warning = FALSE,
               echo    = TRUE,
               include = TRUE,
               eval    = TRUE,
               comment = "")
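With the engine registered, a chunk that declares terminal instead of r is never executed; knitting it just prints the chunk text with the placeholders substituted, ready to paste into the terminal panel. For example, using the swan paths from the config above:

````
```{terminal, warning = FALSE}
cd swan$bioinformatics_stats
```
````

Knitting this should print cd /work/richlab/aliciarich/bioinformatics_stats, which you can copy straight into the remote session while the document keeps a record of exactly what was run.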